Pre-defined gene co-expression modules in rheumatoid arthritis transition towards molecular health following anti-TNF therapy

Abstract Background No reliable biomarkers to predict response to TNF inhibitors (TNFi) in RA patients currently exist. The aims of this study were to replicate changes in gene co-expression modules that were previously reported in response to TNFi therapy in RA; to test if changes in module expression are specific to TNFi therapy; and to determine whether module expression transitions towards a disease-free state in responding patients. Method Published transcriptomic data from the whole blood of disease-free controls (n = 10) and RA patients, treated with the TNFi adalimumab (n = 70) or methotrexate (n = 85), were studied. Treatment response was assessed using the EULAR response criteria following 3 or 6 months of treatment. Change in transcript expression between pre- and post-treatment was recorded for previously defined modules. Linear mixed models tested whether modular expression after treatment transitioned towards a disease-free state. Results For 25 of the 27 modules, change in expression between pre- and post-treatment in the adalimumab cohort replicated published findings. Of these 25 modules, six transitioned towards a disease-free state by 3 months (P < 0.05), irrespective of clinical response. One module (M3.2), related to inflammation and TNF biology, significantly correlated with response to adalimumab. Similar patterns of modular expression, with reduced magnitude, were observed in the methotrexate cohort. Conclusion This study provides independent validation of changes in module expression in response to therapy in RA. However, these effects are not specific to TNFi. Further studies are required to determine whether specific modules could assist molecular classification of therapeutic response.


Introduction
Despite the wealth of treatment options currently available to RA patients, none are universally effective. RA patients are initially treated with conventional synthetic DMARDs (csDMARDs) such as MTX. Where first-line therapy is inadequate in controlling inflammation, patients are prescribed a biological DMARD (bDMARD), often in combination with MTX [1][2][3]. The most commonly prescribed group of bDMARDs are TNF inhibitors (TNFi) [3][4][5][6].
In the UK, treatment response to bDMARDs is determined 3 to 6 months following treatment initiation, according to the National Institute for Health and Care Excellence (NICE) guidance [7]. Although successful in treating RA, bDMARDs are ineffective for $40% of patients, and treatment switching is recommended in non-responding patients by 3 or 6 months. However, data from the British Society for Rheumatology Biologics Registry (BSRBR-RA) showed that a significant proportion of patients are cycling through a high number of bDMARDs, with over 20% trying three or more bDMARDs, and 30% trying two or more distinct classes of bDMARDs to control disease without significant adverse events [8].The current trial-and-error pathway means that a large minority of patients are treated suboptimally for many months or even years [1,9,10].
Because radiographic damage caused by uncontrolled inflammation can occur rapidly [11], swift decisions about therapy changes should improve patient outcomes. The discovery of reliable biomarkers of TNFi response would aid more informed treatment strategies, including objective monitoring and the development of predictive tests, thus improving patient prognosis [11,12]. However, to date, biomarker studies of treatment response in RA have not replicated across studies and populations [4].
One notable exception is a transcriptomic study that reported reproducible findings when investigating changes in 27 pre-defined gene co-expression modules in good responders and non-responders to TNFi therapy, across three independent RA cohorts recruited in the USA [13].
Gene co-expression modules were previously defined by Chaussabel et al. [14], following the analysis of 239 peripheral blood mononuclear cell (PBMC) samples obtained from individuals with systemic autoimmune diseases, cancers, microbial infections and liver transplant recipients undergoing immunosuppressive therapy. Modules were characterized after assessing transcript clustering patterns, in addition to identifying functional associations amongst transcripts and genes that were frequently co-expressed in disease.
Using the same gene co-expression modules, Oswald et al. identified module expression in RA patients consistently changed in good responders after 3 months of treatment with a TNFi, but fewer changes were observed in TNFi non-responders [13]. In the current study, we aimed to determine: (i) whether modular changes in gene expression during early treatment with a TNFi are observed in UK-recruited RA patient samples; (ii) whether the effects are drug-specific, by investigating a MTX-treated cohort; and (iii) whether gene expression modules transition towards a disease-free state in responding patients.

Patient cohort
Two prospective RA patient study cohorts and 10 disease-free controls were studied. One RA cohort was treated with adalimumab, and the other RA cohort was treated with MTX. Summary demographic information on both patient cohorts and disease-free controls are shown in Table 1.
The adalimumab cohort (n ¼ 70) were recruited from UK centres to the Biologics in Rheumatoid Arthritis Genetics and Genomics Study Syndicate (BRAGGSS), previously described by Oliver et al. [15] Eligible patients were white adults with a clinician diagnosis of RA, and about to begin treatment with adalimumab for the first time for RA. The majority of patients (87%) were treated with a concurrent DMARD (including MTX). Patients were categorized as either good (n ¼ 50) or non-responders (n ¼ 20) to treatment following 3 months of therapy using established EULAR response criteria [16]. Non-responders were excluded if anti-drug antibodies were detected in serum samples by radioimmunoassay at 3 months and/or if they selfreported non-adherence. Ethics was approved by the North West 6 Central Manchester South Research Ethics Committee (COREC 04/Q1403/37) and all patients provided written consent [15].
The MTX cohort (n ¼ 85) were recruited from the Rheumatoid Arthritis Medication Study (RAMS), a UK multicentre (n ¼ 38 centres) one-year longitudinal observational study that enrolled new-onset RA patients who are about to commence therapy with MTX as their first csDMARD. Treatment response was assessed 6 months after treatment using the EULAR response criteria. Patients were then categorized as good (n ¼ 42) or non-responders (n ¼ 43) to MTX, as described above. RAMS was approved Rheumatology key messages . In the TNFi cohort, expression of 25 of the 27 modules mirrored published findings. . Reproducible changes in module expression were observed regardless of treatment response in RA patients. . One module (M3.2, inflammation related) appeared to be dependent on good response to adalimumab. Megan Sutcliffe et al. by the Central Manchester NHS Research Ethics Committee (reference 08/H1008/25) and all patients provided written informed consent [17,18].
The disease-free controls were individuals without RA, recruited under the National Repository for Healthy Volunteers (NRHV) study within the Versus Arthritis Centre for Genetics and Genomics at the University of Manchester. Ethical approval was obtained (reference REC 99/8/84) and all volunteers gave written informed consent in compliance with Good Clinical Practice and the Declaration of Helsinki.

Transcriptome measurement
In the adalimumab cohort, whole blood gene expression profiles were captured using the Affymetrix Human Transcriptome Array 2.0 (HTA) at pre-treatment and following 3 months of treatment. In the MTX cohort, Illumina HumanHT-12 v3 Array measured whole blood expression at pre-treatment and following 4 weeks of treatment.

Statistical analysis
As two different arrays were used to acquire the transcript level gene expression data, specific packages that corresponded to each array were used to extract the raw transcript expression data. Pre-analysis quality control steps for the adalimumab and MTX-treated cohorts have been previously described [15,18].
Briefly, all array files were processed using R (version 3.6.1). For HTA array data, the annotation package pd.hta.2.0 was used for platform design. The affy package was used to summarize probe level data into a single expression value for each transcript, before transcripts were quantile normalized and log 2 transformed. The hta20transcriptcluster.db and biomaRT packages were used to map Affymetrix probe identifiers to the corresponding Entrez gene identifier (Entrez ID). For HumanHT-12 array data, GenomeStudio software evaluated bead-level expression and the Bioconductor package, limma was used for quality control. Probes that were not expressed or mapped to more than one genomic location were removed. Data were then log 2 transformed and quantile normalized. The illuminaHumanv4.db package was used to annotate transcripts with Entrez IDs.
The PCAmethods package was used to calculate principal components and to assess potential run order effects or outlier samples. The limma package was used to test for differential expression between pre-treatment and on-treatment time points in good and nonresponders separately (e.g. change in transcription between baseline and 3 months in good responders to adalimumab). Statistical models included baseline DAS28, age, gender, concurrent DMARD use, HAQ scores, smoking habits (never, past or current smoker) and array weights (calculated using the arrayWeights() function) as fixed effects, and patient ID as a random effect as previously described [15,18].
For both patient cohorts the log fold change, average expression, t-statistic and P-value returned from the limma eBayes function was stored for modular gene expression analysis.

Modular analysis of transcriptome
Data from each cohort were analysed according to the methods described by Oswald et al. [13] using the 27 Pre-defined gene co-expression modules https://academic.oup.com/rheumatology gene co-expression modules previously defined by Chaussabel et al. [14] and were numbered M1.1-M3.9. Modules were labelled in accordance with the identity of gene transcripts present in modules e.g. platelets, B cells, cytotoxic cells. Where modules were not easily characterized, no biological name was given and they remained defined as a number [13]. First, probe IDs were mapped to their corresponding Entrez ID and the transcriptomic data generated by Chaussabel et al.
were also checked to ensure up-to-date Entrez IDs were mapped to the Affymetrix identifier in the original study [14].
For good-and non-responders, probes with a significant change in expression (P-value <0.05) were identified between pre-and post-treatment samples using limma. The proportion of significantly changed probes that had a positive or negative fold change in expression within each module was calculated. Contingency tables were computed for each module and a Fishers exact test determined which modules showed statistically significant changes in expression in good and nonresponders, separately. The level of significance was corrected for multiple comparisons between the 27 modules using the Bonferroni correction algorithm (P-value <8.5e-03).

Comparison of patients and disease-free controls
To test if changes in module gene expression in goodand non-responders had transitioned towards a disease-free state, comparisons were made between patients (at baseline and after treatment) and diseasefree controls. The Wilcoxon Signed-Rank Test was used to determine whether the disease-free controls were sex-matched. Gene expression data from disease-free controls were pre-processed using the methods described for the adalimumab cohort. Batch effects between patients and disease-free controls were assessed using principal components analysis and then corrected for using the ComBat function within the sva package in R. No correlation between the first principal component and age, gender, DMARD use, baseline DAS28, DAS28 components, RIN, or RNA extraction batch Probes that significantly changed in expression between baseline and follow-up were identified. The lme4 package was used to compare probe expression within each module using linear mixed models at baseline and after treatment, including a fixed-effect for patient/control status, and independent random effects for probe ID and patient ID. Finally, densities were plotted using ggplot2 to visualize differences in baseline, follow-up and diseasefree control module expression in the good and nonresponders to adalimumab separately.

Data availability
De-identified data presented in this manuscript are available via Figshare data using the following link: https:// doi.org/10.48420/17061680.v1.

Adalimumab cohort
Consistent with the findings from three independent USrecruited RA cohorts, previously described by Oswald increased in expression between baseline and 3 months, in good-and non-responders. One module (M1.6) significantly increased (P <8.5e-03) in expression in the good responders, but not in non-responders.
Module M2.2 (neutrophil biology) significantly decreased in expression in good responders to adalimumab, but not in non-responders. Conversely, expression of module M2.9 significantly decreased in the nonresponder group, while expression did not significantly change in good responders.
One module, module M2.7, did not reflect published findings, as between pre-treatment and 3 months, module expression significantly increased (P <8.5e-03) in good responders to adalimumab. However, no change in module M2.7 expression (P >0.005) was demonstrated in the three independent US-recruited cohorts reported by Oswald et al. [13]. The function of module M2.7 is currently undetermined.

MTX cohort
Change in module expression was also assessed in RA patients whom had commenced treatment with MTX for the first time, where significant changes in modular expression were observed between baseline and 4 weeks (P-value <8.5e-03, Fig. 2 (Figs 1 and 2) and the three independent US-recruited RA cohorts, previously described by Oswald et al. [13]. One of these modules was associated to B-cell biology (M1.3), and another to erythrocyte biology (M2.3). The same opposing direction of change in modular expression was observed in nonresponders to MTX for module M1.1 (plasma cells) and M3.7.

Module expression changed towards a disease-free state
To determine whether module expression transitioned towards a disease-free state, module expression was compared between TNFi-treated patients and diseasefree controls. Disease-free controls were sex-matched 1:1 (Wilcoxon Signed-Rank Test, P ¼ 0.79), but it was  not possible to age-match controls and patients due to the nature of sample collection. For 22 of the 27 modules, difference in module expression between patients and controls was reduced after treatment, suggesting module expression transitioned in the direction of a disease-free state.
Only one module -M3.2 (inflammation) -showed a statistically significant difference (P <8.5e-03) in expression in good responders vs non-responders to adalimumab. By 3 months, module M3.2 expression remained statistically and significantly different to disease-free controls (P ¼ 0.041), while no difference was detected between good-responder patients and disease-free controls (P ¼ 0.190), i.e. good responders to adalimumab had transitioned to a disease-free state whereas nonresponders had not (Table 2).

Discussion
The discovery of reliable biomarkers has been hampered by lack of replication in published studies. However, a recent study by Oswald et al. [13] reported consistent changes in modular gene expression in good responders to TNFi therapy across three independent US RA cohorts. In this current study, we aimed to determine whether modular changes in gene expression during early TNFi treatment were observed in UK patients and, by investigating a MTX-treated cohort, determine whether the effects were TNFi-specific.
We identified a subset of gene co-expression modules that demonstrated consistent and statistically significant changes in expression between pre-and post-treatment within TNFi-treated patients; however, most of these modules changed irrespective of treatment response. Six of these modules significantly transitioned towards a disease-free state due to the effect of adalimumab.
In MTX-treated patients, the expression of two modules (M3.2 and M2.1) significantly changed in the same direction as the TNFi cohort, suggesting non-drug specific changes. However, in patients characterized as a good responder to MTX, eight modules (M1.3, M1.8, M2.3, M3.4, M3.6, M3.7 and M3.9) significantly changed in the opposite direction to TNFi-treated patients and thus appear to be drug-specific effects. This was also seen in the non-responder cohort for module M3.7.
Overall, less consistent changes in module expression were observed in the MTX cohort compared with the TNFi cohort and the three independent RA cohorts reported by Oswald et al. [13] Only two modules (M1.5 related to myeloid linage and M3.2 related to inflammation) displayed significant and consistent changes in module expression in the MTX cohort, the TNFi cohort and the three independent cohorts studied by Oswald et al. [13]. Just one of the two modules (M3.2) was different in adalimumab responders vs nonresponders.
The lack of consistency observed in the MTX cohort could be due to the earlier sampling time-point (4-week) compared with the 3-month follow-up in the TNFitreated RA cohorts. Typically, RA patients will not have responded to treatment with MTX by 4 weeks, and will still be in an inflammatory state [19]. Therefore, we cannot exclude the possibility that a later sampling timepoint of 3 months may have revealed more consistent changes in module expression for MTX. Furthermore, the modest changes in module expression observed in the MTX cohort may reflect less improvement in disease activity by 4 weeks, in comparison to the 3-month follow-up time point in the TNFi cohorts. However, disease activity measures were not available at the 4-week time point to confirm this. Alternatively, the smaller magnitude of change in module expression, and inverse trends in module expression observed in the MTX cohort, could be biologically revealing and some changes in module expression may be TNFi-driven. Additional research is therefore needed to determine the impact of sampling time point on potential drug-specific effects.
Findings from module M3.2 are potentially clinically interesting as responders to adalimumab statistically and significantly transitioned to a disease-free state whereas non-responders did not. Future studies could compare module expression with changes in CRP, an established measure of inflammation used to determine disease activity. Measuring the expression of module M3.2 in conjunction with CRP could provide an improved biological measure of response.
Here we investigated the extremes of response by focussing on EULAR good-and non-responders; however, EULAR grouping is based on the DAS28 that is made up of both objective (swollen joint count, blood marker of inflammation) and subjective (tender joint count and patient assessment of wellbeing) measures [20]. Our results support the argument for the need of a more objective measure of treatment efficacy as biological improvement towards a disease-free state was observed for some modules in both DAS28-derived good and non-responders. This could suggest clinical misclassification of response, though was minimized by using extremes of treatment response. Nonetheless, an objective biological measure of treatment efficacy would allow this to be explored. Alternatively, it reveals that the drug has an effect on gene expression independent of treatment efficacy. The modules that showed significant differences between good-and non-responders should be prioritized for mechanistic studies to understand how treatment effects are mediated.
The comparison between patients and disease-free individuals demonstrated a significant change in module expression for seven modules towards a disease-free  <0.05). The effect size shows the relative difference in transcript expression between patients and controls. A negative effect size reflects a higher level of module expression in the patients compared with the controls, and a positive effect size reflects higher module expression in the controls compared with patients.
state following 3 months of TNFi treatment. Of these seven modules, six modules transitioned towards the expression seen in disease-free controls in the group of TNFi non-responders, with only one module, module M3.2, transitioning more in good responders than nonresponders. Transcripts in module M3.2 are immune and inflammation related, relevant to RA pathophysiology, including genes involved in TGF-beta, TNF signalling, apoptosis and lipopolysaccharide biology [14]. Furthermore, genes in module M3.2, such as NFKBIE, IRF2BP2, MAPKAP-K2, IL1B and IFRD1 map to the IFN type 1 signalling pathway [21][22][23][24][25][26]; a pathway linked to TNFi response in the RA literature [27][28][29]. Moreover, an enrichment of genes involved in IL-1, IL-17, IL-13, IL-4 and IL-10 signalling, plus genes involved in STAT3 modulation; a Th17 transcription factor, are in module M3.2 [30][31][32]. Further research is required to assess the potential of this module, or a combination of relevant modules, in predicting treatment response.
Other studies have identified gene expression modules in transcriptomics datasets derived from synovial tissue, which is also a heterogeneous population of cells. For example, Aterido et al. used a weighted correlation network analysis approach to identify coexpressed modules of genes in synovial tissue from RA patients treated with a TNFi [6]. In that paper, several modules, including a module enriched for genes related to nucleotide metabolism, were statistically associated with TNFi response; however, a direct comparison with the findings of the current study is not possible due to the different ways the modules were defined. In the Aterido paper, associations between gene expression modules and TNFi response were further corroborated using genetic datasets. In the future, it will be important to perform genetic analysis of gene expression modules defined by Chaussabel et al. However, this was beyond the scope of the current study.
A limitation of this and other similar studies is that gene expression data were derived from whole blood, therefore limiting the interpretation of pathway-specific changes in gene expression that rely on specific cell populations to drive changes in module expression. As the modules analysed here were originally defined in whole blood, and one of the aims of this study was to reproduce previously published findings by Oswald et al. [13], we chose to measure gene expression levels in whole blood samples. In the adalimumab-treated patients, we observed changes in modular gene expression in the same direction as those observed by Oswald et al. These findings therefore add to the evidence that whole blood transcriptomics analyses have the potential to identify important biomarkers of treatment response. Future analysis assessing changes in transcript expression in isolated cell populations would be useful to explore pathway-specific effects.
It was not possible to determine whether modular changes were solely driven by TNFi in the adalimumab cohort as the majority of adalimumab-treated patients were concomitantly treated with methotrexate (Table 1). Methotrexate was ineffective at controlling disease symptoms, thereby necessitating escalation to biologics, therefore it could be that the observed changes in modular expression resulted from a combined effect from methotrexate and adalimumab therapy.
A further limitation was that control data were generated separately to the patient data, resulting in a batch effect that was subsequently corrected for. However, as all data were treated identically, batch correction should not have affected the observed differences when comparing good-or non-responders to disease-free controls.
Lastly, a limitation of the analysis is that EULAR was measured at two different time points: 3 months in the adalimumab cohort and 6 months in the MTX cohort. We analysed the 3-month EULAR expression data for the MTX-treated patients, and in that analysis ( Supplementary Fig. S1, available at Rheumatology online) we did observe module expression changed in the same direction as the 6-month time point, but these changes did not reach statistical significance. This was likely due to the 3-month response data being available for only 19 good responders and 22 non-responders to MTX. The fewer sample numbers meant this analysis had reduced statistical power.
A strength of this study is that the addition of control data enabled investigation of the direction of change in modular expression in the context of molecular health. Despite the low number of controls, significant and consistent changes in module expression, in the direction of a disease-free state, were observed.
Our findings provide additional evidence for the utility of transcriptomic data to develop prediction tests or monitor treatment pathways that are responsive to treatment in RA and potentially in other immune-mediated inflammatory diseases where TNFi are similarly effective.

Conclusion
In summary, we have replicated changes in module gene expression that were observed by Oswald et al. in the adalimumab-treated patients [13]. Further research is needed to investigate the effect of sampling time point on modular response to MTX. Further well-powered studies of genes within module 3.2, a module that constitutes transcripts involved in inflammation, are also warranted for TNFi response classification and the potential use of module 3.2 for prediction modelling.
Medtech and In Vitro Diagnostics Co-operative. A.B. is a NIHR Senior Investigator. Her research is supported by the NIHR Manchester BRC and the Centre for Genetics and Genomics Versus Arthritis. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.
Funding: No specific funding was received from any bodies in the public, commercial or not-for-profit sectors to carry out the work described in this article.
Disclosure statement: The authors have declared no conflicts of interest.

Data availability statement
Data are available upon reasonable request by any qualified researchers who engage in rigorous, independent scientific research, and will be provided following review and approval of a research proposal and Statistical Analysis Plan (SAP) and execution of a Data Sharing Agreement (DSA). All data relevant to the study are included in the article.

Supplementary data
Supplementary data are available at Rheumatology online.